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Abstract The multiscale variance stabilization Transform (MSVST) has recently been proposed for Poisson 
data denoising (jZhang et al.ll2008al ). This procedure, which is nonparametric, is based on thresholding wavelet 
coefficients. The restoration algorithm applied after thresholding provides good conservation of source flux. We 
present in this paper an extension of the MSVST to 3D data — in fact 2D- ID data — when the third dimension 
is not a spatial dimension, but the wavelength, the energy, or the time. We show that the MSVST can be 
used for detecting and characterizing astrophysical sources of high-energy gamma rays, using realistic simulated 
observations with the Large Area Telescope (LAT). The LAT was launched in June 2008 on the Fermi Gamma- 
ray Space Telescope mission. Source detection in the LAT data is complicated by the low fluxes of point sources 
relative to the diffuse celestial foreground, the limited angular resolution, and the tremendous variation in that 
resolution with energy (from tens of degrees at ~30 MeV to ~0.1° at 10 GeV). The high-energy gamma-ray sky 
is also quite dynamic, with a large population of sources such active galaxies with accretion-powered black holes 
producing high-energy jets, episodically flaring. The fluxes of these sources can change by an order of magnitude 
or more on time scales of hours. Perhaps the majority of blazars will have average fluxes that are too low to be 
detected but could be found during the hours or days that they are flaring. 

The MSVST algorithm is very fast relative to traditional likelihood model fltting, and permits efficient detection 
across the time dimension and immediate estimation of spectral properties. Astrophysical sources of gamma rays, 
especially active galaxies, are typically quite variable, and our current work may lead to a reliable method to 
quickly characterize the flaring properties of newly-detected sources. 
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Processing 

The high-energy gamma-ray sky will be studied with 
unprecedented sensitivity by the Large Area Telescope 
(LAT), which was launched by NASA on the Fermi mis- 

sion in June 2008. The catalog of gamma-ray sources from 

Send offprint requests to: jstarck@cea.fr the previous mission in this energy range, EGRET on the 
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Compton G amma-Ray Observato ry, has approximately 
270 sources (jHartman et al.lll999t ). For the LAT, several 
thousand gamma-ray sources are expected to be detected, 
with much more accurately determined locations, spectra, 
and light curves. 

We would like to reliably detect as many celestial 
sources of gamma rays as possible. The question is not 
simply one of building up adequate statistics by increas- 
ing exposure times. The majority of the sources that the 
LAT will detect are likely to be gamma-ray blazars (dis- 
tant galaxies whose gamma-ray emission is powered by 
accretion onto supermassive black holes), which are in- 
trinsically variable. They flare episodically in gamma rays. 
The time scales of flares, which can increase the flux by a 
factor of 10 or more, can be minutes to weeks. The duty 
cycle of flaring in gamma rays is not well determined yet, 
but individual blazars can go months or years between 
flares and in general we will not know in advance where 
on the sky the sources will be found. 

The fluxes of celestial gamma rays are low, especially 
relative to the ^1 m^ effective area of the LAT (by far the 
largest effective collecting area ever in the GeV range). 
An additional complicating factor is that diffuse emission 
from the Milky Way itself (which originates in cosmic- 
ray interactions with interstellar gas and radiation) makes 
a relatively intense, structured foreground emission. The 
few very brightest gamma-ray sources will provide approx- 
imately 1 detected gamma ray per minute when they are 
in the field of view of the LAT. The diffuse emission of the 
Milky Way will provide about 2 gamma rays per second, 
distributed over the ^2 sr field of view. 

For previous high-energy gamma-ray missions, the 
standard method of source detection has been model fit- 
ting — maximizing the likelihood function while moving 
trial point sources around in the region of the sky be- 
ing analyzed. This approach has been driven by the lim- 
ited photon counts and the relatively limited resolution of 
gamma-ray telescopes. However, at the sensitivity of the 
LAT, even a relatively "quiet" part of the sky may have 
10 or more point sources close enough together to need to 
be modeled simultaneously when maximizing the (compu- 
tationally expensive) likelihood function. For this reason 
and because of the need to search in time, non-parametric 
algorithms for detecting sources are being investigated. 



Literature overview for Poisson denoising using wavelets 

A host of estimation methods have been proposed in 
the literature for non-parametric Poisson noise removal. 
Major contributions consist of variance stabilization: 
a classical solution is to preprocess the data by apply- 
ing a variance stabili zing transform (VST) such as the 
Anscombe transform ( Anscombe 1 9481 ) ( Donohol [l993l ) . It 
can be shown that the transformed data are approxi- 
mately stationary, independent, and Gaussian. However, 
these transformations are only valid for a sufficiently large 
number of counts per pixel (and of course, for even more 
counts, the Poisson distribution becomes Gaussian with 



equal mean and variance) (jMurtagh et al.lll995r ). The nec- 
essary average number of counts is about 20 if bias is to 
be avoided. 

In this case, as an alternative approach, a filtering ap- 
proach for very small numbers of counts, i ncluding fre- 
quen t zero cases, has been proposed in (jStarck fc Pierre! 
19981 ). which is based on the popular isotropic undec- 



imated wavelet transform (implemented with th e so- 
called a trous algorithm) ( Starck fc Murtagh 20061 ) and 
the autoconvolution histogram technique for deriving 
the probabi lity density funct i on (pdf) of the wavelet 
coefficient (Slezak et al. 1993t Bijaoui fc Jammall 2001 



Starck fc Murtaghl 20061 ). This method is par t of the data 



reduc tion pipeline of the XMM-LSS project 



2004 ) for detecting of clusters of galaxies 



Pierre et al 



Pierre et al 



20071 ). This algorithm is obviously a good candidate for 



Fermi LAT 2D map analysis, but its extension to 2D- 
ID data sets does not exist. It is far from being trivial, 
and even if it were possible, computation time would cer- 
tainly be prohibitive to allow its use for Fermi LAT 2D- 
ID data sets. T hen, an alterna t ive approach is needed. 
Several authors iKqlaczYk|ll997|: IXimmermann fc NowakI 

Biiaoui fc JammaT200ll : 



1999; Nowak fc Baraniuk 1999; 



iFrvzlewicz fc NasonI 120041: IZhang et alj |2008b[) have sug- 
gested that the Haar wavelet transform is very well- 
suited for treating data with Poisson noise. Since a Haar 
wavelet coefficient is just the difference between two ran- 
dom variables following a Poisson distribution, it is eas- 
ier to derive mathematical tools for removing the noise 
than with any other wavelet method. IStarck fc Murtagh 
( 2006f ) study shows that the Haar transform is less effec- 
tive for restoring X-ray astronomical images than the a 
trous algorithm. The reason is that the wavelet shape of 
the isotropic wavelet transform is much better adapted 
to astronomical sources, which are more or less Gaussian- 
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shaped and isotropic, than the Haar wavelet. Some papers 
(IScarglc 1998; K olaczvk &: Nowakll20o4 IWillet fc Nowa5 
|2005[ jWillettii2006l ) proposed a spatial partitioning, possi- 
bly dyadic, of the image for complicated geometrical con- 
tent recovery. This dyadic partitioning concept is however 
again not very well suited to astrophysical data. 

The MSVST alternative 



In a recent paper, IZhang et al.1 (|2008al ) have proposed 
to merge a variance stabilization technique and the multi- 
scale decomposition, leading to the Multi-Scale Variance 
Stabilization Transform (MSVST). In the case of the 
isotropic undecimated wavelet transform, as the wavelet 
coefficients wj are derived by a simple difference of two 
consecutive dyadic scales of the input image (see sec- 



tion 



a,_i 



the stabilized wavelet coeffi- 



cients are obtained by applying a stabilization on both 



ij-i and aj, wj 



Aj-i{aj^i) — Aj{aj), where Aj 



and 

Aj are non-linear transforms that can be seen as a gen- 
eralization of the Anscombe transform; see section [3] for 
details. This new method is fast and easy to implement, 
and more importantly, works very well at very low count 
situations, down to 0.1 photons per pixel. 



This paper 

In this paper, we present a new multiscale representa- 
tion, derived from the MSVST, which allows us to remove 
the Poisson noise in 3D data sets, when the third dimen- 
sion is not a spatial dimension, but the wavelength, the en- 
ergy or the time. Such 3D data are called 2D-1D data sets 
in the sequel. We show that it could be very useful to ana- 
lyze Fermi LAT data, especially when looking for rapidly 
time varying sources. Section [2] describes the Fermi LAT 
simulated data. Section [3] reviews the MSVST method rel- 
ative to the isotropic undecimated wavelet transform and 
section 0] shows how it can be extended to the 2D-1D case. 
Section [5] presents some experiments on simulated Fermi 
LAT data. Conclusions are given in section [6l 

Definitions and notations 

For a real discrete-time filter whose impulse response 
is h[i], h[i] = i], i G Z is its time-reversed version. 
For the sake of clarity, the notation h[i] is used instead 
of hi for the location index. This will lighten the notation 



by avoiding multiple subscripts in the derivations of the 
paper. The discrete circular convolution product of two 
signals will be written and the continuous convolution 
of two functions *. The term circular stands for periodic 
boundary conditions. The symbol S[i] is the Kronecker 
delta. 

For the octave band wavelet representation, analysis 
(respectively, synthesis) filters are denoted h and g (re- 
spectively, h and g). The scaling and wavelet functions 
used for the analysis (respectively, synthesis) are denoted 
(j> (with (^(f ) = J2k h[k]H^ - k),x eR and fc e Z) and 
(with V(f ) = Efe5W'/'(2^ - k),x €R and k € Z) (re- 
spectively, (f) and We also define the scaled dilated 
and translated version of at scale j and position fc as 
4'j,k{x) = (j)(2~^ X — fc), and similarly for ip, (j) and V'- 
A function f{x,y) is isotropic if it is constant along all 
points (x, y) that are equidistant from the origin. 

A distribution is stabilized if its variance is made con- 
stant, typically equal to 1, independently of its mean. A 
transformation applied to a random variable is called a 
variance stabilizing transform (VST), if the distribution 
of the transformed variable is stabilized and is approxi- 
mately Gaussian. 

Glossary 

WT Wavelet Transform 

DWT Discrete (decimated) Wavelet Transform 

UWT Undecimated Wavelet Transform 

lUWT Isotropic Undecimated Wavelet Transform 

VST Variance Stabilization Transform 

MSVST Multi-Scale Variance Stabilization Transform 

LAT Large Area Telescope (LAT) 

FDR False Discovery Rate 

2. Data description 

2.1. Fermi Large area telescope 

The LAT (Fig. [T]) is a photon-counting detector, con- 
verting gamma rays into positron-electron pairs for de- 
tection. The trajectories of the pair are tracked and their 
energies measured in order to reconstruct the direction 
and energy of the gamma ray. 

The energy range of the LAT is very broad, approxi- 
mately 20 MeV - 300 GeV. At energies below a few hun- 
dred MeV, the reconstruction and tracking efficiencies are 
lower, and the angular resolution is poorer, than at higher 
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Figure 1. Cutaway view of the LAT. The LAT is modu- 
lar; one of the 16 towers is shown with its tracking planes 
revealed. High-energy gamma rays convert to electron- 
positron pairs on tungsten foils in the tracking layers. 
The trajectories of the pair are measured very precisely 
using silicon strip detectors in the tracking layers and the 
energies are determined with the Csl calorimeter at the 
bottom. The array of plastic scintillators that cover the 
towers provides an anticoincidence signal for cosmic rays. 
The outermost layers are a thermal blanket and microme- 
teoroid shield. The overall dimensions are 1.8 x 1.8 x 0.75 



energies. The point spread function (PSF) width varies 
from about 3.5° at 100 MeV to better than 0.1° (68% 
containment) at 10 GeV and above. Owing to large-angle 
multiple scattering in the tracker, the PSF has broad tails; 
the 95%/68% containment ratio may be as large as 3. 

Wavelet denoising of LAT data has application as part 
of an algorithm for quickly detecting celestial sources of 
gamma rays. The fundamental inputs to high-level anal- 
ysis of LAT data will be energies, directions, and times 
of the detected gamma rays. (Pointing history and instru- 
ment live times are also inputs for exposure calculations.) 
For the analysis presented here, we consider the LAT data 
for some range of time to have been binned into 'cubes' 
v{x,y,t) of spatial coordinates and time or, v(x,y,E) of 
spatial coordinates and energy, because, as we shall see, 
the wavelet denoising can be applied in multiple dimen- 
sions, and so permits estimation of counts spectra. The 
motivations for filtering data with Poisson noise in the 
wavelet domain are well known — sources of small angular 
size are localized in wavelet space. 

2.2. Simulated LAT data 

The application of MSVST to problems of detection 
and characterization of LAT sources was investigated us- 
ing simulated data. The simulations included a realistic 
observing strategy (sky survey with the proper orbital and 
rocking periods) and response functions for the LAT (ef- 
fective area and angular resolution as functions of energy 
and angle). Point sources of gamma rays were defined with 
systematically varying fiuxes, spectral slopes, and/or flare 
intensities and durations. The simulations also included a 
representative level of diffuse 'background' (celestial plus 



residual charged-particle) for regions of the sky well re- 
moved from the Galactic equator, where the celestial dif- 
fuse emission is particularly intense. The denoising results 
reported in Section [5] use a data cube obtained according 
to this simulation scenario. 

3. The 2D multiscale variance stabilization 
transform (MSVST) 

In this section 



(jZhang et al 



^we review the MSVST method 

2008al ). restricted to the Isotropic 
Undecimated Wavelet Transform (lUWT). Indeed, 
the MSVST can use other transforms such as the stan- 
dard three-orientation undecimated wavele t transform, 
the rid gelet or the curvelet transforms; see ( Zhang et al.l 
l2008a ). In our specific case here, only the lUWT is of 
interest. 



3.1. VST of a filtered Poisson process 

Given X a sequence of n independent Poisson ran- 
dom variables Xi^i — 1, • • • , n, each of mean Ai, let Yi = 
be the filtered process obtained by con- 
volving the sequence X with a discrete filter h. Y denotes 
any one of the F^'s, and Tfc — '}2n{h[i\)^ for fc = 1, 2, • • • . 



If h 



6, then we recover the Anscombe VST 



(jAnscombd 119481 ) of Yi (hence Xi) which acts as if the 
stabilized data arose from a Gaussian white noise with 
unit variance, under the assumption that the intensity 
is large. This is why the Anscombe VST performs poorly 
in low-count settings. But, if the filter h acts as an "av- 
eraging" kernel (more generally a low-pass filter), one can 
reasonably expect that stabilizing Yi would be more ben- 
eficial, since the signal-to-noise ratio measured at the out- 
put of h is expected to be higher. 

Using a local homogeneity assumption, i.e. Ai_j — A 
for all 7 within th e support of h, it has been shown 
( Zhang et al. 2008al ) that for a non-negative filter h, the 
transform Z = h^Y^ 

7r2 r3 . 



c with 6 > and c > defined as 

8ti 2t2 V '''2 

is a second order accurate variance stabilization trans- 
form, with asymptotic unit variance. By second-order ac- 
curate, we mean that the error term in the variance of the 
stabilized variable Z decreases rapidly as 0(A~^). From 
(HI), it is obvious that when h = S, we obtain the classi- 
cal Anscombe VST parameters 6 = 2 and c — 3/8. The 
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authors in (jZhang et al.ll2008af ) have also proved that Z 
is asymptotically distributed as a Gaussian variate with 
mean h\pr\\ and unit variance. A non-pos itive h wit h 



a nega tive c could also be considered; see (jZhang et al. 
for more details. 



FigH] shows the Monte-Carlo estimates of the ex- 
pectation E[Z] (left) and the variance Var \Z\ (right) 
obtained from 2 • 10^ Poisson noise realizations of 
X, plotted as a func tion o f the intensity A for both 
Anscomb e (^ nscombj 1948fl (dashed-dotted) , Haar-Fisz 
(dashed) IrVvflewicz fc NasonI 12004 ) and our VST with 
the 2D i?3-Spline filter as a low-pass filter h (solid). The 
asymptotic bounds (dots) (i.e. 1 for the variance and \f\ 
for the expectation) are also shown. It can be seen that for 
increasing intensity, E[Z] and Var \Z\ approach the theo- 
retical bounds at different rates depending on the VST 
used. Quantitatively, Poisson variables transformed using 
the Anscombe VST can be reasonably considered to be 
unbiased and stabilized for A ^ 10, using Haar-Fisz for 
A ^ 1, and using out VST (after low-pass filtering with 
the chosen K) for A ^ 0.1. 

3.2. The isotropic undecimated wavelet transform 

The undecimated wavelet transform (UWT) uses an 
analysis filter bank (h, g) to decompose a signal ao into 
a coefficient set W — {di, . . . ,dj ,aj}, where dj is the 
wavelet (detail) coefficients at scale j and aj is the approx- 
imation coefficients at the coarsest resolution J. The pas- 
sage from one resolution to the next one is o btained using 
the " a trous" algorithm ([Holschneider et al.i , 1989.' ) (Shensal 
1992h 



Wj+l 



E 

k 

E 

k 



h[k]aj[l + 2^ k], 
g[k]aj[l + 2^ k], 



(2) 
(3) 



where [;] = h[l] if 1/2^ £ Z and other- 
wise, h[l] — h[~l], and 'V denotes discrete cir- 
cular convolution. The reconstruction is given by 



aj [I] 



(h}^ -k aj+i)[l] + (.gfJ ★ Wj+i) 



The filter 



bank [h, g, h, g) needs t o satisfy the so-called exact re- 
const ruction condition (jMallatl Il998t IStarck fc Murtagh 
20061) . 

The Isotropic UWT (lUWT) (Starck et al. l l2007l) uses 
the filter bank {h,g — 6 — h,h = 5,g = 6) where 



h is typically a symmetric low-pass filter such as the 
-Ba-Spline filter. The reconstruction is trivial, i.e., ao = 



aj 



This algorithm is widel y used in astro- 



nomical applications 



Starck et al. 119981) and biomedical 



imaging (|Olivo-Marinll2002f ) to detect isotropic objects. 

The lUWT filter bank in g-dimension (g > 2) becomes 
{hqD,gqD = 5 - /igD, hgD = S,gqD = S) whcre hqu is the 
tensor product of q ID filters /iid. Note that g^D is in 
general non-separable. 

3.3. MSVST with the lUWT 

Now the VST can be combined with the lUWT in the 
following way: since the filters h^^ at all scales j are low- 
pass filters (so have nonzero means), we can first stabilize 
the approximation coefficients aj at each scale using the 
VST, and then compute in the standard way the detail 
coefficients from the stabilized a^'s. Given the particular 
structure of the lUWT analysis filters {h,g), the stabiliza- 
tion procedure is given by 



(4) 



Note that the VST is now scale-dependent (hence the 
name MSVST). The filtering step on aj-i can be rewrit- 
ten as a filtering on oq = X, i.e., aj = ii^^^ * oq, where 
/lO) = /li-i ★ . . . * /ji * /i for j > 1 and h'^"^ = S. Aj is the 
VST operator at scale j 



lUWT ■ 




= /iTJ 






I 


= aj- 


1 - 


MSVST 


' flj 


= /iTJ 




lUWT 




= Aj- 


-i(aj-i) 



(5) 



Let us define rj^^ = J2^ Then according to 

the constants b^^^ and cf-') associated to /i*^^^ must be 
set to 



It. 



U) 



2t. 



^ 



The constants 6^^^ and c^^^ only depend on the filter h and 
the scale level j. They can all be pre-computed once for 
any given h. A schematic overview of the decomposition 
and the inversion of MSVST-^IUWT is depicted in Fig. El 
In summary, lUWT denoising with the MSVST in- 
volves the following three main steps: 
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Figure 2. Behavior of the expectation E[Z] (left) and variance Var [Z] (right) as a function of the underlying intensity, 
for the Anscombe VST, 2D Haar-Fisz VST, and out VST with the 2D Bg-Sphne filter as a low-pass filter h. 



Transformation : Compute the lUWT in conjunction 
with the MSVST as described above. 
Detection : Detect significant detail coefficients by 
hypothesis testing. The appeal of a binary hypoth- 
esis testing approach is that it allows quantitative 
control of significance. Here, we take benefit from 
the asymptotic Gaussianity of the stabilized Oj's that 
will be transferred to the w^'s as it has been shown 



by (jZhang et al.l l2008al ). Indeed, these authors have 



proved that under the null hypothesis Hq : Wj [k] = 
corresponding to the fact that the signal is homoge- 
neous (smooth), the stabilized detail coefficients wj fol- 
low asymptotically a centered normal dist ribution with 



an int ensity-independent variance; see (| Zhang et al 



2n08aL Theorem 1) for details. This variance depends 



only on the filter h and the current scale, and can be 
tabulated once for any h. Thus, the distribution of the 
uij's being known (Gaussian), we can detect the signif- 
icant coefficients by classical binary hypothesis testing. 
Estimation : Reconstruct the final estimate using the 
knowledge of the detected coefficients. This step re- 
quires inverting the MSVST after the detection step. 
For the lUWT filter bank, there is a closed-form inver- 
sion expression as we have 



ao = An 



Aj{aj) + 



J 



(7) 



3.3.1. Example 

Fig. 2] upper left shows a set of objects of different sizes 
and different intensities contaminated by a Poisson noise. 
Each object along any radial branch has the same inte- 
grated intensity within its support and has a more and 
more extended support as we go farther from the center. 
The integrated intensity reduces as the branches turn in 
the clockwise direction. Denoising such an image is chal- 
lenging. Fig. m top-right, bottom-left and right , show re- 
spect ively the filtered images by Haar-Kolaczyk ( Kolaczv^ 
1997t ). Haar-Jammal-Bijaoui ( Biiaoui fc Jammall l200lj ) 
and the MSVST. 

As expected, the relative merits (sensitivity) of the 
MSVST estimator become increasingly salient as we go 



farther from the center, and as the branches turn clock- 
wise. That is, the MSVST estimator outperforms its com- 
petitors as the intensity becomes low. Most sources were 
detected by the MSVST estimator even for very low counts 
situations; see the last branches clockwise in Fig.|4]bottom 
right and compare to Fig. [4] top right and Fig. |4] bottom 
left. 



4. 2D-1D MSVST denoising 

4.1. 2D-1D wavelet transform 

In the previous section, we have seen how a Poisson 
noise can be removed from 2D image using the lUWT 
and the MSVST. Extension to a qD data sets is straight- 
forward, and the denoising will be nearly optimal as long 
as each object belonging to this g-dimensional space is 
roughly isotropic. In the case of 3D data where the third 
dimension is either the time or the energy, we are clearly 
not in this configuration, and the naive analysis of a 3D 
isotropic wavelet does not make sense. Therefore, we want 
to analyze the data with a non-isotropic wavelet, where 
the time or energy scale is not connected to the spatial 
scale. Hence, an ideal wavelet function would be defined 
by: 



(8) 



where V^*^^*'-' is the spatial wavelet and -0^^-' is the temporal 
(or energy) wavelet. In the following, we will consider only 
isotropic and dyadic spatial scales, and we note ji the 
spatial resolution index (i.e. scale = 2-'i), j2 the time (or 
energy) resolution index. Thus, define the scaled spatial 
and temporal (or energy) wavelets 



2Ji ^2Ji 
^0(^)(— ). 



2^1 



) and 



Hence, 



we derive the wavelet coefficients 



Wj-^j^l^x, ky, kz] from a given data set D (k^ and ky 
are spatial index and kz a time (or energy) index). In 
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Figure 3. Diagrams of the MSVST combined with the lUWT. The notations are the same as those of (|1]) and ([7]). The 
left dashed frame shows the decomposition part. Each stage of this frame corresponds to a scale j and an application 
of (HI). The right dashed frame illustrates the direct inversion ([7]). 



Figure 4. Top, XMM simulated data, and Haar-Kolaczyk ( Kolaczvk 1997t ) filtered image. Bottom, Haar-Jammal- 
Bijaoui (|Biiaoiii fc .Tammalll200lh and MSVST filtered images. Intensities logarithmically transformed. 



continuous coordinates, this amounts to the formula 



1 1 



-\-oo 



Hence, we have a 2D-1D undecimated wavelet repre- 
sentation of the input data D: 



Dix,y,z) 



Ji 



^ \ 2h ' 2J1 J ^ V jj 



Dlkx, ky, kz] — iJi,j2 [kxi ky, k^] + ^ ^ ^ji,J2 [kxi ky, k^] 



dxdydz 



where * is the convolution and ^j{x) — ijj{—x). 



Ji Ji 

''^Jl,i^XTky^kz\ + Wjj^ j2 [/Ca;, /Cj,, /Cz] . 

i2 = l Jl = lj2 = l 

(13) 



From this expression, we distinguish four kinds of co- 
Fast undecimated 2D-1D decomposi- efficients: 
tion / reconstruction 



In order to have a fast algorithm for discrete data, we 
use wavelet functions associated to filter banks. Hence, 
our wavelet decomposition consists in applying first a 2D 
lUWT for each frame k^,. Using the 2D lUWT, we have 
the reconstruction formula: 

D\kx, ky, kz\ = ajj \kx, ky\ ~\- ^ ^ '^ji \kx^ ky, Vfc, ,(10) 

where Ji is the number of spatial scales. Then, for each 
spatial location {kx, ky) and for each 2D wavelet scale scale 
ji, we apply a ID wavelet transform along z on the spatial 
wavelet coefficients Wj-^ [kx, ky, k^] such that 



Detail-Detail coefficients {ji < Ji and j2 < J2)' 

Wj^j^[kx,ky,k^] = (5-ft,iD)* 

-k aj-^^i[kx , ky, .] ^ * iji [kx T ky, .] 



(14) 



Approximation-Detail coefficients (ji = Ji and j2 < 
J2): 

^Ji,32 \kxi ky, kz\ ~ * Q^Ji \kx, ky, .] h^\Y) * ^Ji \kx^ ky, .](15) 

Detail-Approximation coefficients (ji < Ji and j2 = 
J2): 



1 , J2 \kx T ky, kz] — ^iD ^ * 1 [kx 1 ky, .] ^ * ^ji jky, - J^ld) 



{J2) 



n[kx,ky,kz]^w,„j,[kx,ky,kz]+Y,w,,^,,[kx,ky,kz], V(fc^-, 4^?^™^*^°^"^??^™^*^°^ ^o^^^^^"^*^ (-^'i = 

and j2 = J2): 



J2 = l 



where J2 is the number of scales along z. The same o,Ji,J2[kx, ky, k^] = h^j^ * aj-^[kx, ky, . 
processing is also applied on the coarse spatial scale 
o-Ji [kx, ky, kz], and we have 



(17) 



As the 2D-1D undecimated wavelet transform just de- 
scribed is fully linear, a Gaussian noise remains Gaussian 
after transformation. Therefore, all thresholding strategies 

aj, [kx , ky , kz] = aj„j, [kx ,ky,kz]+Y, ,n [^x , ky , kz] , V(^h^Mtye been developed for wavelet Gaussian denoising 

are still valid with the 2D-1D wavelet transform. Denoting 
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TH the thresholding operator, the denoised cube in the 
case of additive white Gaussian noise is obtained by: 



Approximation- Approximation coefficients (ji 
and j2 = J2): 



h h 



aji,.l2[kx 



Ji 



\kx , ky , .] 



= Jl 



(22) 



Tll{wj^^j^[kx, ky, kz]) 



are now 



J2 



J2 = l 



k k 

r\jy , r\jz 



J2 



Jl 

ii2=i 



Hence, all 2D-1D wavelet coefficients 
stabilized, and the noise on all these wavelet coefficients 
is Gaussian with known scale-dependent variance that de- 
Tli{wj-^j^[kx, ky, kzJ^Qj^f^g solely on h. Denoising is however not straightfor- 
ward because there is no explicit reconstruction formula 
available because of the form of the stabilization equations 



(18) 



A typical choice of TH is the hard thresholding operator, 
i.e. TH(x) = if \x\ is below a given threshold r, and 
TH(a;) = a; if \x\ > r. The threshold r is generally cho- 
sen between 3 and 5 times the noise standard deviation 



( Starck fc Murtaghll200 



4.2. Variance stabilization 

Putting all pieces together, we are now ready to plug 
the MSVST into the 2D-1D undecimated wavelet trans- 
form. Again, we distinguish four kinds of coefficients that 
take the following forms: 

— Detail-Detail coefficients (ji < Ji and j2 < J2): 

Wji,j2[kx,ky,kz] = ((^-ft-m)* A'i-i,i2-i 



"id * 



CLj^ —1 {kx , ky 



A 



^ID ^ ^jl ; ky 



(19) 



The schematic overview of the way the detail coeffi- 
cients Wj-^_j^ are computed is illustrated in Fig.[5l 
Approximation-Detail coefficients [ji = Ji and j2 < 
J2): 



'^Jl,j2\kx^ ky, kz] — A.J-^j2-l 



^ID ^Jl[kxT ky, . 



A 



Jl ,32 



f^Y£)^ * 0-Jl [kxj ky, ■ 



(20) 



Detail-Approximation coefficients {ji < Ji and j2 
J2): 



\{J2) 



■^jl 



jl,J2 



ID Jl L ' y' ■ 



(21) 



above. Formally, the stabilizing operators Aj^^^ and the 
convolution operators along (x, y) and z do not commute, 
even though the filter bank satisfies the exact reconstruc- 
tion formula. To circumvent this difficulty, we propose 
to solve this reconst ruction problem by d efining the mul- 
tiresolution support (jMurtagh et al.lll995l ) from the stabi- 
lized coefficients, and by using an iterative reconstruction 
scheme. 



4.3. Detection-reconstruction 

As the noise on the stabilized coefficients is Gaussian, 
and without loss of generality, we let its standard devi- 
ation equal to 1, we consider that a wavelet coefficient 
Wj-^j^lkx, ky, kz] is significant, i.e., not due to noise, if its 
absolute value is larger than a critical threshold r, where 
T is typically between 3 and 5. 

The multiresolution support will be obtained by de- 
tecting at each scale the significant coefficients. The mul- 
tiresolution support for ji < J and j2 < J2 is defined 
as 



M. 



31,32 



\kx, ky, kz] — 



1 if Wj^ j2 [kx ,ky,kz] is significan^^^^ 
otherwise. 



In words, the multiresolution support M indicates at 
which scales (spatial and time/energy) and which posi- 
tions, we have significant signal. We denote W the 2D-1D 
undecimated wavelet transform described above, the in- 
verse wavelet transform and Y the input noisy data cube. 

We want our solution X to preserve the significant 
structures in the original data by reproducing exactly the 
same coefficients as the wavelet coefficients of the input 
data Y , but only at scales and positions where signif- 
icant signal has been detected (i.e. MWX = MWY). 
At other scales and positions, we want the smoothest so- 
lution with the lowest budget in terms of wavelet coef- 
ficients. Furthermore, as Poisson intensity functions are 
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Figures. Overview of MSVST with the 2D-1D lUWT. The diagram summarizes the main steps for computing the 
detail coefficients Wj^j^ CH)- The notations are exactly the same as those of subsection 14.21 with giD — S — hiu- 



positive by nature, a positivity constraint is imposed on 
the solution. It is clear that there are many solutions sat- 
isfying the positivity and multiresolution support consis- 
tency requirements, e.g. Y itself. Thus, our reconstruction 
problem based solely on these constraints is an ill-posed 
inverse problem that must be regularized. Typically, the 
solution in which we are interested must be sparse by in- 
volving the lowest budget of wavelet coefficients. Therefore 
our reconstruction is formulated as a constrained sparsity- 
promoting minimization problem that can be written as 
follows 



mm 

X 



subject to 



(mWX = MWY 
I and X > 



(24) 

where || . ||i is the £i-norm playing the role of rcgulariza- 
tion and is well known to promote sparsity (|Donoho..2004l . 
This problem can be solved efficiently using the hybrid 
steepe st descent algorithm (Yamada 2001 : Zhang et al.l 
2008al) . and requires about 10 iterations in practice. 
Transposed into our context, its main steps can be sum- 
marized as follows: 

Require: Input noisy data Y; a low-pass filter h; mul- 
tiresolution support M from the detection step; num- 
ber of iterations A'max- 
1: Initialize X(o) = MWY ^ Mwy, 

for t — 1 to iVniax do 

d = MwY + (1 - Af)WX(*-i), 

xW^P+(n STft[d]), 

5: Update the step (3t = (A^max - i)/(Mnax ~ !)• 
6: end for 



where P+ is the projector onto the positive orthant, i.e. 
P+{x) = max(a;, 0). ST^^ is the soft-thresholding operator 
with threshold /3t, i.e. ST/jJa;] = x — Ptsign{x) if \x\ > fin 
and otherwise. 



4.4. Algorithm summary 

The final MSVST 2D-1D wavelet denoising algorithm 
is the following: 

Require: Input noisy data F; a low-pass filter h\ thresh- 
old level r, 



1: 2D-1D-MSVST : Apply the 2D-1D-MSVST to the 

data using (fT9l) - ((22l) . 
2: Detection : Detect the significant wavelet coefficients 

that are above r, and compute the multiresolution 

support M. 

2: Reconstruction : Reconstruct the denoised data using 
the algorithm above. 

5. Experimental results and discussion 

5.1. MSVST-2D-1D versus MSVST-2D 

We have simulated a data cube according to the pro- 
cedure described in subsection 12.21 The cube contains 
several sources, with spatial positions on a grid. It con- 
tains seven columns and five rows of LAT sources (i.e. 35 
sources) with different power-law spectra. The cube size 
is 161 X 161 X 31, with a total number of photons equal 
to 25948, i.e. an average of 0.032 photons per pixel. Fig. [6] 
shows the 2D image obtained after integrating the sim- 
ulated data cube along the z-axis. Fig. [7] shows a com- 
parison between 2D-MSVST denoising of this image, and 
the image obtained by first applying a 2D-1D-MSVST de- 
noising to the input cube, and integrating afterward along 
the z-axis. Fig. [7] upper left and right show denoising re- 
sults for the 2D-MSVST with respectively threshold val- 
ues r = 3 and t — 5, and Fig. [7] bottom left and right 
show the results for the 2D-1D-MSVST using respectively 
T — 4 and r = 6 detection levels. The reason for using 
a higher threshold level for the 2D- ID cube is to correct 
for multiple hypothesis testings, and to get the same con- 
trol over global statistical error rates. Roughly speaking, 
the number of false detections increases with the num- 
ber of coefficients being tested simultaneously. Therefore, 
one must correct for multiple comparisons using e.g. the 
conservative Bonferro ni correction or the fals e disco very 
rate (FDR) procedure Benjamini &: Hochbergl ( 19951 ). As 
the number of coefficients is much higher with the whole 
2D-1D cube, the critical detection threshold t of 2D-1D 
denoising must be higher to have a false detection rate 
comparable to the 2D denoising. As we can clearly see 
from Fig. \7\ the results are very close. This means that 
applying a 2D-1D denoising on the cube instead of a 2D 
denoising on the integrated image does not degrade the 
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Figure 6. Image obtained by integrating along the z-axis of the simulated data cube. 



detection power of the MSVST. The main advantage of 
the 2D-1D-MSVST is the fact that we recover the spec- 
tral (or temporal) information for each spatial position. 
Fig. [5] shows two frames (frame 16 top left and frame 25 
bottom left) of the input cube and the same frames after 
the 2D-1D-MSVST denoising top right and bottom right. 
Fig. [5] displays the obtained spectra at two different spa- 
tial positions (112,47) and (126,79) which correspond to 
the centers of two distinct sources. 



5.2. Time-varying source detection 

We have simulated a time varying source in a cube 
of size 64 X 64 X 128. The source has a Gaussian shape 
both in space and time. It is centered in the middle of 
the cube at (32,32,64); i.e. its brightest point is at this 
location. The standard deviation of the Gaussian is 1.8 
in space (pixel unit), and 1.2 along time (frame unit). 
The total flux of the source (i.e. spatial and temporal in- 
tegration) is 100. We have added a background level of 
0.1. Finally, Poisson noise was generated. Fig. [TUl shows 
respectively from left to right an image of the original 
source, the flux per time frame and the integration of all 
noisy frames along the time axis. As it can be seen, the 
source is hardly detectable in Fig. [10] right. By running 
the 2D-MSVST denoising method on the time-integrated 
image, we were not able to detect it. Then we applied the 
2D-1D-MSVST denoising method on the noisy 3D data 
set. This time, we were able to restore the source with 
a threshold level r = 6. Fig. [TT] left depicts one frame 
(frame 64) of the denoised cube, and Fig. [11] right shows 
the flux of the recovered source per frame (dotted line). 
The solid and thick-solid lines show respectively the flux 
per time frame after background subtraction in the noisy 
data and the original noise- free data set. We can conclude 
from this experiment that the 2D-1D-MSVST is able to re- 
cover rapidly time- varying sources in the spatio-temporal 
data set, whereas even a robust algorithm such as the 2D- 
MSVST method will completely fail if we integrate along 
the time axis. This was expected since the co-addition 
of all frames mixes the few frames containing the source 
with those which contain only the noisy background. Co- 
adding followed by a 2D detection is clearly suboptimal, 
except if we repeat the denoising procedure with many 



temporal windows with varying size. We can also notice 
that the 2D- ID-MS VST is able to recover very weU the 
times at which the source flares, although the source is 
slightly spread out on the time axis and the flux of the 
source is not very well estimated, and other methods such 
as maximum likelihood should be preferred for a correct 
flux estimation, once the sources have been detected. 

5.3. Diffuse emission of the Galaxy 

In this experiment, we have sim ulated a 720 x 360 x 
128 cube using the Galprop code IStrong et al.l (j2007l ) 



that has a model of the diffuse gamma-ray emission of 
the Milky Way. The units of the pixels are photons 
cm^2s~^sr~^ MeV~^ . The gridding in Galactic longitude 
and latitude is 0.5 degrees, and the 128 energy planes are 
logarithmically spaced from 30 MeV to 50 GeV. A six 
months LAT data set was created by multiplying the sim- 
ulated cube with the exposure (6 months), and by con- 
volving each energy band with the point spread function 
of the LAT instrument. The PSF strongly varies with the 
energy. Finally we have created the noisy observations as- 
suming a Poisson noise distribution. 

Fig.[T2]left shows from top to bottom the original sim- 
ulated data, the noisy data and the filtered data for the 
band at energy 171-181 Mev. The same figures for the 
band 9.87-1.04 GeV are shown in Fig. [H right. 



6. Conclusion 

The motivations for a reliable nonparametric source 
detection algorithm to apply to Fermi LAT data are clear. 
Especially for the relatively short time ranges over which 
we will want to study sources, the data will be squarely in 
the low counts regime with widely varying response func- 
tions and significant celestial foregrounds. In this paper, 
we have shown that the MSVST, associated with a 2D-1D 
wavelet transform, is a very efficient way to detect time- 
varying sources. The proposed algorithm is as powerful 
as the 2D-MSVST applied to co-added frames to detect a 
source if the latter is slowly varying or constant over time. 
But when the source is rapidly varying, we lose some de- 
tection power when we co-add frames having no source 
and those containing the sources. Our approach gives us 
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Figure 7. Top, 2D-MSVST filtering on the integrated image with respectively a r = 3 and a r = 5 detection level. 
Bottom, integrated image after a 2D-1D-MSVST denoising of the simulated data cube, with respectively a r = 4 and 
a r = 6 detection level. 



Figure 8. Top, frame number 16 of the input cube and the same frame after the 2D-1D-MSVST filtering at 6a . 
Bottom, frame number 25 of the input cube and the same frame after the 2D- ID-MS VST filtering at 6a . 



Figure 9. Pixel spectra at two different spatial locations after the 2D-1D-MSVST filtering. 



an alternative to frame-eo-adding and outperforms the 2D 
algorithms on the co-added frames. Unlike 2D denoising, 
our method fully exploits the information in the 3D data 
set and allows to recover the source dynamics by detecting 
temporally varying sources. 
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noisy data after background subtraction (solid line), for the original noise- free cube (thick-solid line) and for the 
recovered source (dashed line). 
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band 171-181 Mev, noisy simulated data and filtered data using the MSVST. Right, same images for energy band 
9.87-1.04 GeV. 
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